Crowd counting using multi-scale multi-task convolutional neural network
CAO Jinmeng, NI Rongrong, YANG Biao
Journal of Computer Applications
2019, 39(1): 199-204.
DOI: 10.11772/j.issn.1001-9081.2018051132
Crowd counting plays a significant role in intelligent surveillance. To address scale variation, non-uniform density distribution and partial occlusion of crowds, a crowd counting method using a Multi-scale Multi-task Convolutional Neural Network (MMCNN) was proposed. Firstly, a novel adaptive human-shaped kernel was used to generate a density map describing the crowd information, which eliminated the partial occlusion problem. Then, scale variation was handled by constructing a multi-scale convolutional neural network, and non-uniform density distribution was handled by a multi-task learning mechanism that simultaneously estimates the density map and the density level of the crowd. Furthermore, a weighted loss function was proposed to improve the accuracy of crowd counting. Evaluations on the UCF_CC_50 and World Expo'10 datasets demonstrated the effectiveness of the proposed adaptive human-shaped kernel. The experimental results show that, compared with the method proposed by Sindagi et al. (SINDAGI V A, PATEL V M. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway, NJ: IEEE, 2017: 1-6), the Mean Absolute Error (MAE) and Mean Squared Error (MSE) of the proposed method on the UCF_CC_50 dataset are decreased by 1.7 and 45 respectively. Compared with the method proposed by Zhang et al. (ZHANG Y, ZHOU D, CHEN S, et al. Single-image crowd counting via multi-column convolutional neural network. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2016: 589-597), the MAE of the proposed method on the World Expo'10 dataset is decreased by 1.5. In addition, evaluations on practical bus videos yielded an error of approximately 0-3, which verifies the practicability of the proposed counting approach.
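The abstract reports results as MAE and MSE over per-image crowd counts. As a rough illustration of how such figures are typically obtained from predicted density maps, the following Python sketch sums each density map to get a count and then compares counts against ground truth. It is not the authors' evaluation code: the function names, the toy data, and the `root` option are all assumptions made for illustration. In particular, many crowd counting papers report the square root of the mean squared error under the name MSE, and the abstract does not say which variant is used here, so both are offered.

```python
import math
from typing import Sequence


def count_from_density(density_map: Sequence[Sequence[float]]) -> float:
    """Estimated crowd count: the sum of all values in a density map."""
    return sum(sum(row) for row in density_map)


def mae(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean absolute error between predicted and true per-image counts."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def mse(pred: Sequence[float], truth: Sequence[float], root: bool = True) -> float:
    """Mean squared error between predicted and true per-image counts.

    root=True returns the rooted variant (an assumption about convention);
    root=False returns the plain mean of squared errors.
    """
    mean_sq = sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)
    return math.sqrt(mean_sq) if root else mean_sq


# Hypothetical toy data: two tiny 2x2 "density maps" and their true counts.
density_maps = [
    [[0.5, 0.5], [1.0, 0.0]],  # sums to 2.0
    [[1.0, 1.0], [1.0, 2.0]],  # sums to 5.0
]
true_counts = [3.0, 5.0]

pred_counts = [count_from_density(d) for d in density_maps]
print("MAE:", mae(pred_counts, true_counts))
print("MSE (rooted):", mse(pred_counts, true_counts))
```

Because the count is the integral of the density map, any kernel used to build the ground-truth maps (the human-shaped kernel in the abstract, or a plain Gaussian) leaves this evaluation step unchanged as long as each annotated person contributes a total mass of 1.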